Studies of highly-boosted top quarks produced inclusively in pp collisions at $\sqrt{s} = 14$ TeV are
discussed. Using Monte Carlo models after a fast detector simulation, it is shown that jet masses
alone provide a sensitive probe for top quarks produced inside high-$p_T$ jets. The hadronic decays of
such top quarks can be studied in a data-driven approach by analysing shapes of jet-mass distributions.
It is shown that inclusive production of boosted top quarks can be observed if its cross
section is at least twice the prediction of the approximate next-to-next-to-leading-order
(aNNLO) calculation for the $t\bar{t}$ process. The $t\bar{t}$ process with the nominal aNNLO strength can be
measured using the masses of jets after $b$-tagging.

PACS numbers: 14.65.Ha, 12.38.-t 



I. INTRODUCTION 

Heavy particles with masses above a TeV decaying to
top quarks can lead to an enhanced cross section of top
quarks compared to the Standard Model expectations.
The fact that such a cross section can be more than
doubled for top quarks with high transverse momenta
($p_T$(top)) was recognized [1] almost immediately after
the discovery of top quarks at the Tevatron. However,
the Standard Model predictions for the top-quark cross
section have not yet been confronted with experimental data
for transverse energies close to the TeV scale.

According to the Standard Model, inclusive production
of top quarks is dominated by the $t\bar{t}$ process. Top-quark
production also includes contributions from single-top
processes ($t$- and $s$-channels) and from $Wt$ production. Top quarks
can also be produced via associated Higgs production.
Finally, top quarks at very large $p_T$(jet) can originate
from fragmentation, but no data exist to constrain this
process.

Currently, there are several high-$p_T$ measurements of
top quarks focusing on the $t\bar{t}$ event topology. The D0
collaboration has reported the $t\bar{t}$ cross section up to
$p_T$(top) = 350 GeV [2]. The CDF collaboration [3] performed
searches for highly-boosted top quarks, but the statistics
were insufficient to claim observation
of top-quark production at $p_T$(top) > 400 GeV. At the
LHC, ATLAS performed [4] searches for $Z'$ extending
the reach in $p_T$(top) up to 500 GeV, but without cross
section measurements. CMS recently measured the top
quark $p_T$ distribution up to $p_T$(top) = 400 GeV [5].

The measurement of top-quark cross sections at very
large transverse momenta is challenging. For large jet
transverse momenta, the identification of leptons (muons
and electrons) from the $W$ decay is difficult since they
are often collimated with $b$-jets from the top decays.
This leads to a reduced electron efficiency due to isolation
requirements and large fake rates for muons due to
the presence of $b$-quark decay products. In addition,
$b$-tagging suffers from inefficiency at large
$p_T$(jet) and from poor separation between the signal and
multijet background events.

For the above reasons, the main focus of this
analysis is the hadronic final-state characteristics of jets
which are expected to be sensitive to the production of
hadronically decaying top quarks with large $p_T$(top). For
such studies, jet masses and jet shapes are often discussed
as a useful tool for the identification of top quarks and
for reduction of the overwhelming rate from conventional
QCD processes [6, 7].

In this paper, we adopt a strategy based on a high-precision
measurement of shapes of jet masses. Using
realistic Monte Carlo (MC) simulations after a fast
detector simulation, we show that hadronic decays of
highly-boosted top quarks can be observed by performing
a data-driven analysis of jet-mass shapes near the 170
GeV region, without any additional technique involving
jet substructure variables. This article shows that this
method becomes feasible if the top-quark yield in the
fiducial region $p_T$(top) > 0.8 TeV is a factor of two or more
larger than the expectation from the best-understood $t\bar{t}$
process. Given the large theoretical uncertainties for the $t\bar{t}$
process at large $p_T$(top) and a number of other poorly
understood sources (see Sect. I A) contributing to top
quark production at large $p_T$(top), this approach can
be promising for observation of inclusively produced top
quarks. Moreover, we also demonstrate that $b$-tagging
can substantially increase the signal-over-background ratio,
leading to observation of top jets from $t\bar{t}$.

FIG. 1. The NLO and aNNLO cross sections for the number
of top quarks in the $t\bar{t}$ process as a function of transverse
momenta for $|\eta| < 0.8$. The hatched area shows the renormalization
scale uncertainty for NLO, while the filled green area
shows the PDF uncertainty (see the text of Sect. I A). The
dashed line shows the NLO cross section for $\sqrt{s} = 8$ TeV.



A. Theoretical calculations for inclusive top production

There are several Standard Model processes contributing
to inclusive top-quark production at large $p_T$(jet).
The best-studied process is the $t\bar{t}$ process, in which each
boosted top quark gives rise to a jet. Single-top production
($t$- and $s$-channels) and top-quark associated
production are other sources of top-quark jets. In these
cases, no second jet originating from a hadronically decaying
top quark is expected. Top quarks within a single
high-$p_T$ jet can be produced by flavor-changing processes
and fragmentation. Finally, new resonance physics
most readily contributes to the high-$p_T$ region.

For the present analysis, the theoretical calculation for
high-$p_T$ top quarks was performed at next-to-leading
order (NLO) using the MCFM 6.3 program [8] based on
the CT10 parton density functions (PDF) [9]. The renormalization
($\mu_R$) and factorization ($\mu_F$) scales were varied
between $m(\mathrm{top}) - m(\mathrm{top})/2$ and $m(\mathrm{top}) + m(\mathrm{top})/2$,
keeping the two scales equal. The PDF uncertainty was calculated from
53 CT10 PDF sets. A check was performed with the
POWHEG program [10], which uses a $p_T$-dependent (dynamic)
scale (considered to be more appropriate
at large $p_T$(top)). This model was found to be in
good agreement with the MCFM prediction within the
estimated renormalization and factorization scale uncertainties.
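
As an illustration of how an uncertainty can be derived from 53 PDF members, the sketch below applies the symmetric Hessian formula commonly used with CTEQ-type sets (one central member plus 26 eigenvector pairs). The text does not state which prescription was actually used, so the formula choice, the function name and the input numbers are assumptions made for illustration only.

```python
import math

def hessian_pdf_uncertainty(values):
    """Symmetric Hessian uncertainty for an observable evaluated on a PDF set.

    values[0] is the central member; values[1], values[2] are the +/- members
    of eigenvector 1, values[3], values[4] those of eigenvector 2, and so on.
    """
    if len(values) < 3 or len(values) % 2 != 1:
        raise ValueError("expected 1 central member plus an even number of error members")
    plus = values[1::2]
    minus = values[2::2]
    return 0.5 * math.sqrt(sum((p - m) ** 2 for p, m in zip(plus, minus)))

# HYPOTHETICAL cross-section values (arbitrary units), not taken from the paper:
# one central value plus 26 eigenvector pairs, i.e. 53 members in total.
members = [100.0] + [101.0, 99.0] * 26
uncertainty = hessian_pdf_uncertainty(members)
print(f"central = {members[0]:.1f}, PDF uncertainty = {uncertainty:.2f}")
```

Note that the CT10 error members are defined at the 90% confidence level, so a rescaling is needed if a 68% band is wanted.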

Near partonic threshold for $t\bar{t}$ production, the contributions
from soft-gluon emission become dominant. The
soft-gluon corrections to the double-differential top cross
section in transverse momentum and rapidity can be
resummed at next-to-next-to-leading-logarithm (NNLL)
accuracy via the two-loop soft anomalous dimension matrices
[11]. The resummed result has been expanded at
fixed order to next-to-next-to-leading order (NNLO) and,
after integration over rapidity, used to calculate the top
quark transverse momentum distribution, $d\sigma/dp_T$. This
approximate next-to-next-to-leading-order (aNNLO) calculation
from NNLL soft-gluon resummation leads to a
factor of two larger $t\bar{t}$ cross section at large $p_T$(top)
compared to NLO.

Figure 1 shows the NLO and aNNLO cross sections
for top quarks from the $t\bar{t}$ process in pp collisions at the
center-of-mass energy $\sqrt{s} = 14$ TeV. The cross sections
are presented as a function of the transverse momentum
cut in the pseudorapidity region $|\eta| < 0.8$. The
expected PDF uncertainty is about 20% (shown as a filled
band in Fig. 1), while the renormalization scale uncertainty
is smaller. For comparison, the cross section
for $\sqrt{s} = 8$ TeV is also shown, but without uncertainties.
Assuming an integrated luminosity of 10 fb$^{-1}$, the
NLO calculation predicts 3500 top quarks in all decay
channels in the fiducial volume $p_T$(top) > 0.8 TeV.
This number is expected to increase to 5920 top quarks
for the aNNLO. The contribution to the top-quark yield
from single-top production ($t$-channel [12], $s$-channel
[13], and $Wt$ production [14]) is expected to be smaller
[11] than for the $t\bar{t}$ process.

Although the main focus of this study is jets with
$p_T$(jet) > 0.8 TeV, it should be pointed out that contributions
to such jets from top quarks with $p_T$(top) lower
than 0.8 TeV are possible due to jet-energy resolution
effects. In order to take such effects into account, top
quarks were generated at lower $p_T$(top) than the minimum
$p_T$(jet) = 0.8 TeV used in this analysis. In the
following studies, top quarks were generated using MC
models with $p_T$(top) > 0.65 TeV. Their rate was then
scaled to 15690 top quarks, as predicted by the aNNLO
for the fiducial region $p_T$(top) > 0.65 TeV assuming an
integrated luminosity of 10 fb$^{-1}$.
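
The yields quoted in this section are related to fiducial cross sections through N = sigma x L. The short sketch below only re-expresses the numbers given in the text (3500, 5920 and 15690 top quarks for 10 fb^-1); the function and variable names are illustrative choices, not part of the original analysis.

```python
LUMINOSITY_FB = 10.0  # integrated luminosity in fb^-1, as assumed in the text

def cross_section_fb(n_events, luminosity_fb=LUMINOSITY_FB):
    """Cross section in fb implied by an expected yield: sigma = N / L."""
    return n_events / luminosity_fb

def expected_yield(sigma_fb, luminosity_fb=LUMINOSITY_FB):
    """Expected number of events for a cross section in fb: N = sigma * L."""
    return sigma_fb * luminosity_fb

# Yields quoted in the text.
n_nlo_08 = 3500      # NLO,   pT(top) > 0.8 TeV
n_annlo_08 = 5920    # aNNLO, pT(top) > 0.8 TeV
n_annlo_065 = 15690  # aNNLO, pT(top) > 0.65 TeV

# Ratio of the integrated aNNLO and NLO yields above 0.8 TeV.
k_factor = n_annlo_08 / n_nlo_08

print(f"sigma(NLO,   pT > 0.8 TeV)  = {cross_section_fb(n_nlo_08):.0f} fb")
print(f"sigma(aNNLO, pT > 0.8 TeV)  = {cross_section_fb(n_annlo_08):.0f} fb")
print(f"sigma(aNNLO, pT > 0.65 TeV) = {cross_section_fb(n_annlo_065):.0f} fb")
print(f"aNNLO/NLO above 0.8 TeV     = {k_factor:.2f}")
```

The integrated ratio above 0.8 TeV is smaller than the factor of two quoted for large $p_T$(top), which is consistent with the enhancement growing with transverse momentum.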



B. Monte Carlo simulations 

Top-quark jets in pp collisions were modeled using the
PYTHIA8 [16] and HERWIG++ [17] MC models assuming pp
collisions at a center-of-mass energy of $\sqrt{s} = 14$ TeV.
As discussed above, the number of top quarks in the
fiducial region $p_T$(top) > 0.65 TeV was scaled to the
aNNLO cross section assuming an integrated luminosity
of 10 fb$^{-1}$.

In addition to the top-quark initiated jets, the QCD
background due to jets originating from light-flavor quarks
and gluons was considered. Hadronic jets from all QCD
processes (excluding $t\bar{t}$ production) were generated
using PYTHIA8 and HERWIG++. The MC inclusive
cross section of jets was corrected to match the NLO
prediction estimated with the NLOJET++ program [18]. The
estimated scaling correction was found to be close to 10%.

FIG. 2. Expectations for the jet mass distribution initiated
by top quarks using PYTHIA8 and HERWIG++ after the fast detector
simulation. The jet selection cuts are $p_T$(jet) > 0.8
TeV and $|\eta(\mathrm{jet})| < 0.8$. The number of initial top quarks is
normalized to the aNNLO for $p_T$(top) > 0.65 TeV. The expected
number of top jets shown in this figure is 3,500, with
2,200 in the Gaussian core $140 < M_{\mathrm{jet}} < 200$ GeV. The jet
masses generated with PYTHIA8 were fitted in the mass range
100-210 GeV using a Crystal Ball function [15]. The bottom
plot shows the fit residuals. The fit has $\chi^2/\mathrm{ndf} = 1.3$.



II. RESULTS 

A. Masses of top jets 

As is well known, jet masses are sensitive to the presence
of top-quark decays. Figure 2 shows the masses
($M_{\mathrm{jet}}$) of jets initiated by top quarks ("top jets") using
PYTHIA8 and HERWIG++ after the fast detector simulation.
The jet selection cuts are $p_T$(jet) > 0.8 TeV and
$|\eta(\mathrm{jet})| < 0.8$. The jet masses include contributions from
all top decays (including leptonic decays of $W$ bosons).
The jet mass distribution can be described by a Crystal
Ball function [15], which has a Gaussian core (with a mean
$m_0$ and a width $\sigma$) and a power-law tail with an exponent
$n$ to account for energy losses in hadronic decays or
leptonic $W$ decays. The parameter $\alpha$ defines the transition
between the Gaussian and the power-law functions.
Figure 2 shows the fit with the Crystal Ball function using
PYTHIA8. The peak position of the Gaussian component,
which is intended to describe fully hadronic decays,
is close to 180 GeV with the width $\sigma \simeq 20$ GeV. Figure 2
shows that the difference in shapes between PYTHIA8 and
HERWIG++ is small and thus can be neglected.
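
For reference, the Crystal Ball shape described above can be written down in a few lines. The sketch below uses the standard unnormalised definition with a low-mass tail; the values m0 = 180 GeV and sigma = 20 GeV follow the text, while alpha and n are placeholder values chosen only for illustration, since the fitted values are not quoted.

```python
import math

def crystal_ball(x, m0, sigma, alpha, n, norm=1.0):
    """Unnormalised Crystal Ball function.

    Gaussian core with mean m0 and width sigma, joined continuously to a
    power-law tail with exponent n at a distance |alpha| * sigma from the peak.
    alpha > 0 puts the tail on the low side, alpha < 0 on the high side.
    """
    if sigma <= 0 or n <= 0 or alpha == 0:
        raise ValueError("sigma and n must be positive, alpha must be non-zero")
    t = (x - m0) / sigma
    if alpha < 0:
        t = -t
    a = abs(alpha)
    if t > -a:
        return norm * math.exp(-0.5 * t * t)
    # Coefficients that make the function and its first derivative continuous.
    big_a = (n / a) ** n * math.exp(-0.5 * a * a)
    big_b = n / a - a
    return norm * big_a * (big_b - t) ** (-n)

# m0 and sigma from the text; alpha and n are HYPOTHETICAL placeholder values.
M0, SIGMA, ALPHA, N = 180.0, 20.0, 1.0, 2.0
for mass in (100.0, 140.0, 160.0, 180.0, 200.0):
    print(f"M_jet = {mass:5.1f} GeV  ->  {crystal_ball(mass, M0, SIGMA, ALPHA, N):.4f}")
```

Below the transition point the power-law tail falls much more slowly than a Gaussian, which is what allows the function to absorb top decays with lost energy.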



B. Jet masses for light-flavor jets 



The samples for $t\bar{t}$ and for the QCD dijet background
events were processed through a fast detector simulation
based on the DELPHES 2.0.3 framework [19] assuming
the ATLAS detector geometry. The most crucial inputs to such a
simulation are the resolutions of the hadronic and
electromagnetic calorimeters of the ATLAS detector. These
were taken from the default DELPHES setting based on the
ATLAS studies [20, 21].



C. Jet mass reconstruction 

Events after the fast detector simulation were selected
if they contained at least one jet reconstructed with the
anti-$k_T$ algorithm [22] with a distance parameter of 0.6.
This distance parameter is optimal for collecting
the decay products of hadronically decaying top quarks
inside jets with $p_T$(jet) > 0.8 TeV [7]. Jets were
reconstructed with the FastJet package [23] using the
DELPHES calorimeter cell positions and energies.
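
To make the reconstruction step concrete, the sketch below shows the two ingredients involved: the anti-$k_T$ distance measures with R = 0.6, and the invariant mass of a jet built from calorimeter cells treated as massless four-vectors. This is not the FastJet implementation, which performs the full sequential clustering; the function names and the treatment of cells as massless objects are assumptions made for illustration.

```python
import math

def cell_four_vector(energy, eta, phi):
    """Massless four-vector (E, px, py, pz) of a calorimeter cell."""
    pt = energy / math.cosh(eta)
    return (energy, pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))

def jet_mass(cells):
    """Invariant mass of the summed four-vectors of cells given as (E, eta, phi)."""
    e = px = py = pz = 0.0
    for energy, eta, phi in cells:
        ce, cx, cy, cz = cell_four_vector(energy, eta, phi)
        e, px, py, pz = e + ce, px + cx, py + cy, pz + cz
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding errors

def antikt_distance(pt_i, y_i, phi_i, pt_j, y_j, phi_j, radius=0.6):
    """Anti-kT distance d_ij = min(pt_i^-2, pt_j^-2) * dR_ij^2 / R^2.

    Azimuthal angles are assumed to lie in the same 2*pi interval.
    """
    dphi = abs(phi_i - phi_j)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dr2 = (y_i - y_j) ** 2 + dphi ** 2
    return min(pt_i ** -2, pt_j ** -2) * dr2 / radius ** 2

def antikt_beam_distance(pt_i):
    """Anti-kT beam distance d_iB = pt_i^-2."""
    return pt_i ** -2

# Two cells of 50 GeV each, back to back in azimuth at eta = 0 (toy input).
print(f"toy jet mass = {jet_mass([(50.0, 0.0, 0.0), (50.0, 0.0, math.pi)]):.1f} GeV")
```

Two objects are merged when their pairwise distance is smaller than both beam distances, so a hard object absorbs everything within a radius R before soft objects cluster among themselves.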

The final jets were selected with $p_T$(jet) > 0.8 TeV
and $|\eta(\mathrm{jet})| < 0.8$. For the current analysis, the central
calorimeter region is used in order to avoid biases in the
reconstruction of jet shapes and in order to increase the
signal-over-background ratio for boosted top searches:
for $p_T$(top) > 0.8 TeV, top quarks from the $t\bar{t}$ process
are predominantly produced in the very central rapidity region.



The mass distribution of jets originating from light
quarks and gluons is distinct from that of jets initiated
by top quarks. Figure 3(a) shows the $M_{\mathrm{jet}}$ distributions
for light-flavor QCD jets (without the $t\bar{t}$ process)
for PYTHIA8 and HERWIG++ after the fast detector simulation.
The number of light-flavor jets was scaled to the
expectation from the NLOJET++ program as discussed before.
$W$+jet events were also studied in the
context of a possible contribution to the jet-mass shape.
It was found that $W$+jet events do not distort the region
near $M_{\mathrm{jet}} \simeq 170$-$180$ GeV.

The jet masses for light jets can reasonably be described
by the functional form $a \cdot M_{\mathrm{jet}}^{-b} \cdot \exp(-c \cdot M_{\mathrm{jet}})$,
where $a$, $b$ and $c$ are free parameters. A similar function
was previously used in the measurement of hadronic
$W/Z$ decays in two-jet mass spectra [24]. A fit using this
function provides an MC-independent way to search for
any significant deviations from the jet mass shape, which
is expected to be falling in the tails. The fit residuals
for PYTHIA8 and HERWIG++ show no significant deviation
from zero.
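
The background parametrisation above becomes linear in (ln a, b, c) after taking logarithms, which gives a quick way to obtain parameter estimates. The sketch below uses an unweighted log-linear least-squares fit as a simplified stand-in for the $\chi^2$ fit used in this paper; the function names, the mass rescaling and the toy input values are assumptions made for illustration.

```python
import math

def background(mass, a, b, c):
    """Smoothly falling background shape a * M^(-b) * exp(-c * M)."""
    return a * mass ** (-b) * math.exp(-c * mass)

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    size = 3
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * size
    for i in reversed(range(size)):
        rest = sum(aug[i][k] * solution[k] for k in range(i + 1, size))
        solution[i] = (aug[i][size] - rest) / aug[i][i]
    return solution

def fit_background(masses, counts, scale=100.0):
    """Unweighted least-squares fit of ln N = ln a - b ln M - c M.

    Masses are divided by `scale` internally to keep the normal equations
    well conditioned; the returned (a, b, c) refer to the unscaled mass.
    """
    rows = [(1.0, -math.log(m / scale), -m / scale) for m in masses]
    logs = [math.log(n) for n in counts]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * y for r, y in zip(rows, logs)) for i in range(3)]
    ln_a_scaled, b, c_scaled = solve3(normal, rhs)
    return math.exp(ln_a_scaled) * scale ** b, b, c_scaled / scale

# Toy, noise-free spectrum in 100 < M_jet < 270 GeV with HYPOTHETICAL parameters.
toy_masses = [105.0 + 10.0 * i for i in range(17)]
toy_counts = [background(m, 1.0e9, 2.0, 0.01) for m in toy_masses]
a_fit, b_fit, c_fit = fit_background(toy_masses, toy_counts)
print(f"a = {a_fit:.3e}, b = {b_fit:.4f}, c = {c_fit:.5f}")
```

A log-linear fit weights all bins equally in log space, so for real histograms a Poisson-weighted $\chi^2$ or likelihood fit, seeded with these estimates, would be the more appropriate choice.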

FIG. 3. Expectations for the jet-mass distributions for the
MC models after the fast detector simulation. The rate of
light-flavor jets is scaled to the NLO prediction for inclusive
jets. The jet masses are shown (a) assuming no $t\bar{t}$ process
and (b) with the $t\bar{t}$ process included. A $\chi^2$ fit was performed
using the background function $a \cdot M_{\mathrm{jet}}^{-b} \cdot \exp(-c \cdot M_{\mathrm{jet}})$ in the
mass range $100 < M_{\mathrm{jet}} < 270$ GeV. The jet mass prediction
shown in (b) as shaded histograms is based on PYTHIA8 $t\bar{t}$
scaled to the aNNLO. The fit quality is $\chi^2/\mathrm{ndf} = 1.9$ for (a)
and $\chi^2/\mathrm{ndf} = 2.1$ for (b).

The inclusion of top quarks modifies $M_{\mathrm{jet}}$ near the 170
GeV region. Figure 3(b) shows the expectation for $M_{\mathrm{jet}}$
assuming the contribution from the $t\bar{t}$ process. The top
jets were simulated using PYTHIA8, while their yield was
scaled to the aNNLO calculation. The fit using
$a \cdot M_{\mathrm{jet}}^{-b} \cdot \exp(-c \cdot M_{\mathrm{jet}})$ was performed in the range
$100 < M_{\mathrm{jet}} < 270$ GeV. The fit residuals do not show any significant
excess above zero, indicating that the extraction of the
top signal assuming the nominal aNNLO yield for $t\bar{t}$ can
be challenging.
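
The fit quality and residuals referred to in this section can be computed as sketched below. The sketch assumes Gaussian-approximated Poisson errors, sigma_i = sqrt(n_i), and three free background parameters; neither the error model nor the residual definition is spelled out in the text, and the bin contents are invented toy numbers.

```python
import math

def pulls(counts, predictions):
    """Normalised residuals (n_i - f_i) / sqrt(n_i) for each mass bin."""
    if any(n <= 0 for n in counts):
        raise ValueError("all bins must be populated for sqrt(n) errors")
    return [(n - f) / math.sqrt(n) for n, f in zip(counts, predictions)]

def chi2_per_ndf(counts, predictions, n_params):
    """Chi-square per degree of freedom, with ndf = number of bins - n_params."""
    ndf = len(counts) - n_params
    if ndf <= 0:
        raise ValueError("need more bins than fit parameters")
    chi2 = sum(p * p for p in pulls(counts, predictions))
    return chi2 / ndf

# TOY bin contents and fit predictions, for illustration only.
toy_counts = [100.0, 400.0, 900.0, 1600.0, 2500.0, 3600.0]
toy_fit = [110.0, 380.0, 930.0, 1600.0, 2450.0, 3660.0]
print("pulls:", [round(p, 2) for p in pulls(toy_counts, toy_fit)])
print("chi2/ndf:", round(chi2_per_ndf(toy_counts, toy_fit, n_params=3), 2))
```

A localised run of positive pulls near 170-180 GeV, flanked by negative ones, is the pattern that signals a peak being absorbed by a background-only fit.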

FIG. 4. Expectations for the jet mass distributions using
PYTHIA8 and HERWIG++ after the fast detector simulation.
For the simulation, top quarks were added to light-flavor jets.
The QCD dijet background was scaled to the NLO inclusive
jet cross section. A $\chi^2$ fit was performed using the background
function $a \cdot M_{\mathrm{jet}}^{-b} \cdot \exp(-c \cdot M_{\mathrm{jet}})$ in the mass range
$100 < M_{\mathrm{jet}} < 270$ GeV. (a) The PYTHIA8 expectation with the
normalisation from the aNNLO for $t\bar{t}$ scaled by a factor of
two. (b) The same distribution using the $t\bar{t}$ signal yield
predicted by the aNNLO after applying $b$-tagging to background
and top jets. The fit quality using the background
function is $\chi^2/\mathrm{ndf} = 2.7$ for (a) and $\chi^2/\mathrm{ndf} = 3.5$ for (b).

The situation is different if the top-quark yield is
somewhat larger than the $t\bar{t}$ expectation. For example,
Fig. 4(a) shows what happens when the top signal has a
cross section a factor of two larger than the aNNLO prediction
shown before. The signal is difficult to miss; the
residuals of the fit near $M_{\mathrm{jet}} \simeq 180$ GeV show an excess
above zero and have a rather characteristic S-shape
due to the pull from the signal region. This is more
apparent for HERWIG++ than for PYTHIA8, indicating a
model dependence of this observation.

FIG. 5. The distributions of jet masses for $p_T$(jet) > 0.8 TeV
and $|\eta(\mathrm{jet})| < 0.8$ for MCs after the fast detector simulation.
The jet masses include contributions from $t\bar{t}$ assuming that
the $t\bar{t}$ cross section is a factor of two larger than the aNNLO
cross section. A simultaneous $\chi^2$ fit was performed in the
mass range $100 < M_{\mathrm{jet}} < 270$ GeV using the function
$a \cdot M_{\mathrm{jet}}^{-b} \cdot \exp(-c \cdot M_{\mathrm{jet}})$ for the background description plus a
Gaussian to describe the excess near 170 GeV. To improve the
fit stability, the width of the Gaussian is fixed to 20 GeV as
expected for top jets. The bottom plots show the fit residuals
with respect to the fitted signal plus background function,
as well as with respect to the background component of the
combined fit.

Another way of looking at the effect of top quarks on
the jet mass distribution is to reduce the contribution of
light-flavor jets using a $b$-tagging technique. Figure 4(b)
shows the jet masses with the nominal $t\bar{t}$ signal strength,
but after applying $b$-tagging using the DELPHES [19]
setting. It assumes a 40% $b$-quark reconstruction efficiency,
and 10% and 1% mistag rates for $c$-quark and
light-flavor jets, respectively. The $b$-tagging increases the
signal-over-background ratio, and the $t\bar{t}$ signal is clearly
observed.
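
The effect of the $b$-tagging working point described above (40% efficiency for $b$, 10% and 1% mistag rates for $c$ and light-flavor jets) on the signal-over-background ratio can be estimated with simple arithmetic. In the sketch below, the flavour composition of the QCD background and the assumption that every top jet is tagged with the $b$ efficiency are hypothetical inputs chosen for illustration; the starting ratio of 0.10 corresponds to the roughly 10% level quoted in this paper for the $t\bar{t}$ process.

```python
# Tagging probabilities quoted in the text for the DELPHES setting.
TAG_PROBABILITY = {"b": 0.40, "c": 0.10, "light": 0.01}

def tagged_fraction(flavour_fractions):
    """Average tagging probability of a jet sample with the given flavour mix."""
    total = sum(flavour_fractions.values())
    if total <= 0:
        raise ValueError("flavour fractions must sum to a positive number")
    weighted = sum(TAG_PROBABILITY[f] * w for f, w in flavour_fractions.items())
    return weighted / total

# HYPOTHETICAL flavour composition of the QCD jet background (not from the paper).
qcd_mix = {"light": 0.95, "c": 0.04, "b": 0.01}
# ASSUMPTION: every top jet contains a b quark and is tagged with the b efficiency.
top_mix = {"b": 1.0}

s_over_b_before = 0.10  # approximate level quoted in the text for the ttbar process
gain = tagged_fraction(top_mix) / tagged_fraction(qcd_mix)
s_over_b_after = s_over_b_before * gain

print(f"background tag rate = {tagged_fraction(qcd_mix):.4f}")
print(f"S/B gain            = {gain:.1f}")
print(f"S/B after tagging   = {s_over_b_after:.2f}")
```

The gain depends strongly on the heavy-flavour content of the background, so the numbers printed here indicate only the order of magnitude of the improvement under the stated assumptions.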

The scenario in which the cross section of boosted top
quarks is higher than the $t\bar{t}$ prediction was further studied
in Fig. 5, where a potential excess of top jets near the 170
GeV region is extracted using a signal plus background
function. As before, light-flavor jets were combined with
top jets from the $t\bar{t}$ process. For this hypothetical scenario,
the yield of the latter process was scaled by a factor
of two with respect to the aNNLO prediction. The signal
function is assumed to be a Gaussian with a width of
20 GeV, as expected for the top jets (see Fig. 2).

The number of top quarks included in the simulation
for $p_T$(top) > 0.8 TeV was 11,840 (5920 top quarks
from the aNNLO times two, see Sect. I A). This leads
to 4,400 top jets with $p_T$(jet) > 0.8 TeV contributing
to the $M_{\mathrm{jet}} \sim 170$ GeV region (2,200 top jets in the
Gaussian core shown in Fig. 2 times two). According
to the fit shown in Fig. 5, the number of extracted top
jets is between 2,000 and 3,400, depending on the MC
simulation. This number was extracted by integrating the
Gaussian component of the background plus signal fit.
Thus, the extracted number of top jets is of the same order as
the expected number of top jets included in the simulation
(45-77% of 4,400), but there is some indication that the
signal-plus-background fit somewhat underestimates the number of top jets.
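
The extraction step described above amounts to integrating a Gaussian of fixed width over the mass axis. The sketch below shows that integral for a histogram fit and compares the extracted range quoted in the text with the expected 4,400 top jets; the peak height and bin width used in the example call are hypothetical, since they are not quoted here.

```python
import math

def gaussian_yield(peak_height, sigma, bin_width):
    """Number of events under a Gaussian fitted to a histogram.

    peak_height is the fitted amplitude in events per bin, sigma the Gaussian
    width and bin_width the histogram bin width (same units as sigma).
    """
    return peak_height * sigma * math.sqrt(2.0 * math.pi) / bin_width

# Numbers quoted in the text.
expected_top_jets = 2 * 2200           # Gaussian core of Fig. 2 times two
extracted_range = (2000, 3400)         # depending on the MC simulation
fractions = [n / expected_top_jets for n in extracted_range]

# HYPOTHETICAL fit result: 100 events per 10 GeV bin at the peak, sigma = 20 GeV.
example = gaussian_yield(peak_height=100.0, sigma=20.0, bin_width=10.0)

print(f"expected top jets: {expected_top_jets}")
print(f"extracted / expected: {fractions[0]:.2f} to {fractions[1]:.2f}")
print(f"example Gaussian yield: {example:.0f} events")
```

Because the Gaussian width is fixed in the fit, the extracted yield scales linearly with the fitted peak height, which makes the result sensitive to how much of the peak the background function absorbs.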

While the scenario in which the number of top jets is a factor
of two larger than the aNNLO prediction for $t\bar{t}$ may
seem exotic at first, such an assumption may not be too
far from the Standard Model expectation for top quarks
produced inclusively within a jet (see the discussion in
Sect. I A). Given the large difference between the aNNLO
and NLO [11], higher-order QCD effects for the $t\bar{t}$ process
may play a significant role in increasing the rate of top-quark
jets at very large $p_T$(jet). It is also important to mention
that theoretical uncertainties, especially those related to
the PDF, can be as large as 20% (see Fig. 1). Less understood
contributions from single-top production (about 30% at
lower $p_T$(top)), from flavor-changing processes and from
fragmentation within jets should also be considered for the
inclusive production of top quarks inside jets. Taking
all such effects into account, our conjecture about the
factor of two may not be too far from the real situation.
Therefore, a better understanding of all Standard Model
processes leading to top production at high $p_T$(jet) is
needed.

One can also consider the discussed result from the
point of view of discovery reach. The approach can be
used to exclude potential sources of new physics for
a number of models (such as those based on $Z'$ and $W'$
bosons) leading to top quarks at large $p_T$(jet). From the
above consideration, any source of new physics can be
excluded if it leads to a top-quark cross section above
1184 fb in the fiducial region $p_T$(jet) > 0.8 TeV and
$|\eta(\mathrm{jet})| < 0.8$. This cross section is obtained from the
aNNLO $t\bar{t}$ prediction multiplied by a factor of two. Note
that the approach can exclude a number of exotic processes.
For example, models with $Z'$ and Kaluza-Klein
gluons may have larger cross sections compared to the
Standard Model $t\bar{t}$ process at very large $p_T$(jet). As a
consequence, a number of limits have been set [4, 25]
excluding such models up to 1.5-2 TeV without experimental
observation of top quarks from the Standard
Model $t\bar{t}$ process at $p_T$(top) > 0.6 TeV.
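
The 1184 fb threshold quoted above follows directly from the aNNLO yield of Sect. I A and the assumed integrated luminosity. Written out, with the symbol names $\sigma_{\mathrm{excl}}$, $N_{\mathrm{aNNLO}}$ and $\mathcal{L}$ introduced here only for illustration:

```latex
\sigma_{\mathrm{excl}}
  = 2 \times \frac{N_{\mathrm{aNNLO}}}{\mathcal{L}}
  = 2 \times \frac{5920}{10~\mathrm{fb}^{-1}}
  = 1184~\mathrm{fb}
```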

High-precision studies of jet masses using analytic
background templates may seem difficult from the instrumental
point of view, since we are looking for a top-quark
signal on top of a smoothly falling distribution which has
a signal-over-background ratio at the level of 10% for the
$t\bar{t}$ process. However, the assessment of systematic uncertainties on
the presence of a bump requires a different strategy
than for a typical jet-mass measurement. Any variation
of selection cuts or change in the instrumental procedure
should be followed by the data-driven approach using
the analytic fit to identify a bump after each systematic
change, unlike a typical QCD measurement of jet
masses. For example, a jet-energy scale variation should
lead to a change of jet masses, but the signal strength after
the signal-plus-background fit should not be strongly
affected, given the data-driven nature of such an extraction.

Finally, other techniques based
on $b$-tagging, jet shapes and jet substructure can be considered,
which can also help to deal with some experimentally
unavoidable effects, such as pile-up. These
techniques have the potential to increase the signal-over-background
ratio for $M_{\mathrm{jet}}$ close to 170 GeV when dealing
with high-$p_T$ inclusive jets. This has been illustrated in
Fig. 4(b) when considering jets after $b$-tagging. As
follows from this study, if the QCD multijet background
is reduced by at least a factor of two relative to the top-quark
signal, the $t\bar{t}$ process should be well observed for
the yield expected from the aNNLO calculation. Studies
of such techniques are outside the scope of this paper and
can be found elsewhere [6, 7].

III. CONCLUSIONS 

This paper shows that jet masses alone, without any
complicated techniques involving substructure variables,
already provide a sensitive probe for inclusively produced
top quarks within high-$p_T$ jets. Due to the nature of the
inclusive measurement, such a technique is not based on
tagging of top quarks in the opposite direction. The approach
allows top quarks to be studied under the assumption
that the background has a smoothly falling
shape and does not contain a bump near the 170 GeV
region, so that it can be modeled analytically.

As shown in this paper, the method has the potential
to detect highly-boosted top quarks if their yield is
a factor of two or more larger than that from the best-understood
$t\bar{t}$ process assuming the aNNLO prediction.
This observation also implies that any technique capable
of reducing the QCD background near $M_{\mathrm{jet}} \simeq 170$ GeV by at
least a factor of two should be sufficient for the observation
of boosted top quarks from the Standard Model
$t\bar{t}$ process. There are other sources of inclusive production
of top quarks at very large $p_T$(jet), but a good
understanding of them requires further studies. Once they are
understood, any enhancement of the top-quark cross section
over the Standard Model prediction would be indicative
of the presence of new resonances at the TeV scale.